C2C++(TM)

Phone: South Africa-21-75-9197
Fax: South Africa-21-72-8005
COMPUSERVE: 70732,3352
CONTACT: John Viveiros

PUBLISHED BY:
CompuSource
PO Box 510, Constantia
Cape Town, South Africa, 7848

TRADEMARKS:
C2C++(TM) and C2CPP(TM) are trademarks of CompuSource. IBM CSet++(TM) is a trademark of IBM Corporation.

COPYRIGHT AND DISTRIBUTION

You may distribute this manual freely provided that you do so with the software program C2CPP.ZIP which must include the ORDER.FRM file. Other than the above exclusion, CompuSource reserves all its rights under international copyright treaties worldwide.

CONTENTS

1. Introduction
2. Buyer's Decision Guide Summary
3. Ordering C2C++
4. Try before you Buy
5. Pricing
6. Support Availability
7. Installation
   7.1 System requirements
   7.2 Installing C2C++
8. Licensing Information
9. Overview
   9.1 What C2C++ will do
   9.2 What C2C++ will not do
   9.3 Accuracy
   9.4 Examining the sample source code
10. Translating your program
   10.1 Syntax of available options
   10.2 The C2CPP command line
   10.3 Command line options
      10.3.1 File Specification Options
      10.3.2 Flag Settings
      10.3.3 Source Code Options
      10.3.4 Other Options
   10.4 Preparing your source code
   10.5 Your source code conventions
   10.6 Defining classes of functions
   10.7 Tutorial
      10.7.1 Determining what to translate
      10.7.2 Preparation
      10.7.3 Deciding on C2CPP Parameters
      10.7.4 Invocation of C2CPP
      10.7.5 Compiling the translated C++ code
   10.8 Tips
11. Advanced topics
   11.1 The four phases of translation
   11.2 The program description database
      11.2.1 Database Tables
      11.2.2 Database Fields
12. Appendix A - Errors
   12.1 Error message format
   12.2 List of error messages
      12.2.1 Fatal Errors
      12.2.2 Errors
      12.2.3 Level 1 Warnings
      12.2.4 Level 2 Warnings
      12.2.5 Informational Messages
13. Appendix B - Cost Analysis
   13.1 Cost of Manual Translation
   13.2 Cost of Automatic Translation
14. Bibliography
15. Your Suggestions

1. INTRODUCTION

C2C++ has proved a highly cost-effective conversion tool over many large projects that would otherwise have taken hundreds of man-years to complete. Given both the benefits of object orientation and the challenges involved in converting C to C++, C2C++ is an indispensable tool.

2. BUYER'S DECISION GUIDE SUMMARY

As a general guideline, we suggest purchasing C2C++ if the cost of a manual conversion is at least three times the purchase price and training costs of C2C++, which establishes a cutoff point at 8,700 lines of source code. Beyond this point it makes compelling financial sense to purchase and utilise C2C++. Details regarding how the above information was derived can be found in the section Appendix B - Cost Analysis.

Are you aware that an automated tool named C2C++ can achieve up to 95% accurate translations (not just recompilations) of existing C code into object-oriented C++ code?

CompuSource has provided the following information so that you can make an intelligent choice as to whether it is necessary for you to read any further.

a) Are you considering a move to object orientation?
b) Is your programs' source code written in C?
c) Are you considering using C++ to take advantage of object orientation?

If any of your answers to the above questions is no, then C2C++ is not really meant for you. If your answers have been yes, reading further could be of tremendous benefit to you and your organisation.

a) Would you derive benefit from being able to use your existing C source code from within your C++ source code?
b) Is your C++ source code hampered by the existing C code having been written in a procedural, non-object-oriented manner?
c) Have you considered the costs of rewriting the existing C source code into object-oriented C++ source code?

The success of C2C++ has been proven on a project of almost 500,000 lines of C source code.
It has been our experience that large programs are converted by C2C++ with a higher percentage of accuracy than are small programs.

3. ORDERING C2C++

Only once you are completely satisfied with the evaluation, FAX or send your order through CompuServe. We have provided the file called ORDER.FRM with the software. You can edit it with any standard editor. Please complete this form in full so that we can process your details quickly.

We prefer you to send this order form via CompuServe. If you are sending your order via CompuServe, please make sure that it is sent SENSITIVITY CONFIDENTIAL.

ORDERING VIA COMPUSERVE:
COMPUSERVE: 70732,3352

If you do not know how to send mail with sensitivity confidential, fax the order instead, for security.

ORDERING VIA FAX:
Fax: South Africa 21-72-8005

4. TRY BEFORE YOU BUY

To achieve the widest distribution channel with the most user satisfaction, we market C2C++ on a try before you buy policy.

This is how it works: There is a limited copy of the software on CompuServe called C2CPP.ZIP which you need to download in order to purchase C2C++. You can use it on your own software before placing your order, to evaluate the success of C2C++.

You must have a copy of PKUNZIP to extract the files from C2CPP.ZIP. This is available on most services like CompuServe.

5. PRICING

Prices are quoted in US Dollars. Seat Licenses range from $1,495 - $1,095 (plus shipping).

   Seats     Cost per seat
   01-05     $1495.00
   06-10     $1395.00
   11-15     $1295.00
   16-20     $1195.00
   20+       $1095.00

Unfortunately, there are no academic discounts, since C2C++ is meant to be used as a commercial tool.

6. SUPPORT AVAILABILITY

Off-site support is available to users of the unlimited version. Off-site support is available by fax or CompuServe E-Mail (this is the preferred and quickest method) to designated contact persons within the purchaser's company. Support is limited to queries regarding usage of the C2C++ product.
On-site support is available only where CompuSource has a direct involvement in consultancy or the conversion. Consultancy services and on-site support would be separately negotiated and would mean consultation or complete project implementation by CompuSource.

7. INSTALLATION

7.1 SYSTEM REQUIREMENTS

Operating system requirements:
OS/2 2.x, with its sophisticated memory management and Boot manager facility which allows multiple native operating system environments, is our operating system of choice. Once the code has been converted, you can simply transfer it onto virtually any platform in the industry. C2C++ is not limited to translating OS/2 specific source code; it can handle source code from any platform provided it satisfies certain requirements specified later in this manual.

Memory requirements: 8-16 megabytes
OS/2 2.x runs best with 16 megabytes; however, if you have 8 megabytes, OS/2 2.x's swapper technology uses the hard drive as extra virtual memory.

Disk requirements:
   OS/2 2.x           40 megabytes
   C2C++              2 megabytes
   working storage    up to 5 * your code size megabytes

CPU requirements: 386 or higher

Compiler requirements and prerequisites:
Preprocessed COPIES of your original C source code files, which contain #line directives and have a .i file extension. We recommend the IBM CSet++, however any ANSI 3 compliant compiler should be able to handle the task.

Editor requirements:
None. Please use your favourite one.

7.2 INSTALLING C2C++

You need the PKUNZIP.EXE utility before you begin. This utility can be found on almost every bulletin board system, as it is a shareware program.

Create the directory C2CPP on any drive of your choice where you would like to store the C2C++ program and its files.

Copy C2CPP.ZIP into the C2CPP directory.

Extract the files from C2CPP.ZIP into the \C2CPP directory using PKUNZIP.EXE.

You can parameterize where your source code resides.
The translated code can also be parameterized to be placed on any drive within any directory.

The sample source code resides as follows:

There is a sub-directory called \C2CPP\SAMPLE with the sample program COLUMN.EXE and its source code:
   COLUMN.C, COLUMN.H, TRANSFRM.C, TRANSFRM.H, FILEIO.C, FILEIO.H, PARAMS.C, PARAMS.H

There is also a sub-directory called \C2CPP\SAMPLE\CPP with the translated sample code:
   COLUMN.CPP, COLUMN.HPP, TRANSFRM.CPP, TRANSFRM.HPP, FILEIO.CPP, FILEIO.HPP, PARAMS.CPP, PARAMS.HPP

8. LICENSING INFORMATION

The program is licensed, not sold.

You are responsible for the selection of the program, its installation, its correct use and its results.

C2C++ is provided to you "AS-IS" without any warranty or guarantee of any kind, other than to replace a corrupted or damaged file if C2C++ is purchased.

The shareware version of the program is provided to you solely for the purpose of assisting you in the evaluation of the purchase of the non-shareware version of C2C++.

CompuSource shall not be liable for any damages arising out of the use of C2C++ software.

You may only use this program on one machine at any one time.

You are responsible for the payment of any taxes resulting from this licence.

CompuSource makes no warranty that your C program will be able to be translated by C2C++.

9. OVERVIEW

9.1 WHAT C2C++ WILL DO

C2C++ translates C source code into object-oriented C++ source code.

C2C++ works particularly well on large programs, since large programs typically enforce structure.

C2C++ has a major advantage over manual translation in that it can achieve a global overview of all of the source code in your program and can apply the changes automatically and consistently.

Given a C source code program which makes use of structs to organise its variables, C2C++ will translate that program into a restructured C++ version which will take advantage of many object-oriented C++ facilities with a high degree of accuracy.
With enough time, an expert human C++ programmer could do a better job of translating the C code into C++ code, but even in such situations the automatic nature of C2C++ will speed up the conversion project.

C2C++ is not simply a program that makes old C code compile under a C++ compiler; it actually extends and restructures your program to use and take advantage of C++ classes, thereby making it more object-oriented, BUT... C2C++ will not transform poorly written, unstructured C programs into well written structured C++ programs.

9.2 WHAT C2C++ WILL NOT DO

C2C++ has some limitations regarding C source code which it is not able to translate. The current list of known problem cases is:

a) It cannot create good C++ code out of bad C code.
b) It does not convert unions, or structs containing unions, into classes.
c) It does not handle asm blocks or Microsoft "based" pointers.
d) It does not output static class data definitions within a translated cpp module.
e) Some macro expansions may not be fully translated, especially if the token pasting operator '##' is used.

9.3 ACCURACY

Typically 90-95% of the original source code can be translated without any need for modification.

Projects containing highly inconsistent code are best tackled by database type manipulations of the intermediate results generated by C2C++ after each phase of its analysis. In addition, if manual modifications are necessary, it is suggested that the original source code be run through a source code formatting program prior to conversion.

C2C++ outputs a message trace of the conversion process that includes warning and error messages regarding C code that it either cannot handle or that would result in invalid C++ code. Working through this log file eliminates the majority of any outstanding conversion problems. The remaining problems are then detected by running the converted code through an ANSI compliant C++ compiler.
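As a rough illustration of the kind of restructuring described in section 9.1, the sketch below places a small procedural C fragment next to a hand-written object-oriented equivalent. It is not output produced by C2C++. All names (COLUMN_DATA, ColumnInit, ColumnData and so on) are hypothetical and chosen only for this example, and the code that C2C++ actually generates may differ in naming and layout.

```cpp
#include <cassert>

// Procedural C style (hypothetical names): a struct plus free functions
// that take a pointer to the struct as their first parameter.
struct COLUMN_DATA {
    int width;   // characters per column
    int count;   // number of columns
};

void ColumnInit(struct COLUMN_DATA *col, int width)
{
    assert(width > 0);
    col->width = width;
    col->count = 0;
}

void ColumnAdd(struct COLUMN_DATA *col, int n)
{
    col->count += n;
}

int ColumnTotalWidth(const struct COLUMN_DATA *col)
{
    return col->width * col->count;
}

// Object-oriented C++ style, written by hand for comparison: the struct
// becomes a class, its fields become private data members, and the
// functions whose first parameter was the struct become member functions.
class ColumnData {
public:
    void Init(int newWidth)   // parameter renamed so it cannot clash with the member
    {
        assert(newWidth > 0);
        width = newWidth;
        count = 0;
    }
    void Add(int n) { count += n; }
    int TotalWidth() const { return width * count; }

private:
    int width;   // characters per column
    int count;   // number of columns
};
```

Both versions compute the same result. The difference is that the class hides its data and callers write d.Add(3) rather than ColumnAdd(&c, 3).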
9.4 EXAMINING THE SAMPLE SOURCE CODE

The purpose of this section is to discuss the sample program and the way in which it was translated into C++.

The sample program is named COLUMN and is a useful utility in its own right. This sample program is provided to you solely for the purpose of assisting you in the evaluation of C2C++. The sample program code is provided to you "AS IS" without any warranty of any kind. CompuSource shall not be liable for any damages arising out of the use of such sample code.

You can either view the sample code and its translation as-is, or you can run C2C++ on your system to re-generate the translated C++ code for yourself to verify that C2C++ functions correctly on your system.

In order to run C2C++ on the sample code: Change directory to "\C2CPP\SAMPLE" and run the batch file "SAMPLE.CMD".

The sample source code demonstrates that C2C++:

a) can correctly generate translated C++ code, including all the modifications necessary for member function calls, and expressions involving class data members,
b) can automatically categorise functions into classes and assign them an appropriate visibility scoping level,
c) can process multiple module programs (it can also translate programs spread across many directories),
d) can handle and eliminate conflicts such as the same name being used for a function parameter or local variable name as that used by a data member within that function's class,
e) can produce a well-organised standard C++ header file layout, making it easier to read,
f) can detect and provide for inter-module header file cross-references of classes,
g) can in general simplify your code through usage of C++ features instead of being limited to C.

There are many more features of C2C++ that are described elsewhere in this manual, and also many others you will only recognise through experimentation with C2C++ on your own source code.

10. TRANSLATING YOUR PROGRAM

10.1 SYNTAX OF AVAILABLE OPTIONS

a) Parameterised elements are enclosed in angle brackets <>.
b) When you have a list of items from which you can choose one, the logical or symbol | separates the items.

10.2 THE C2CPP COMMAND LINE

a) You can invoke C2CPP like any other OS/2 program, such as from an OS/2 command line, or using a .CMD file.
b) To translate multiple C files, specify them one after the other on the command line.
c) Instead of specifying options on the OS/2 command line, you can use a response file as input. A response file is a text file that contains a list of options and filenames to be passed as parameters to C2CPP.
d) The command strings in the response file can span multiple lines, where the end of each line is treated as an option separator.
e) You can nest another response file within an existing response file.
f) To include comments within the response file, use a hash character #, which specifies the rest of the line is to be treated as though it were a comment.
g) A response file can have any valid filename and extension, except for embedded spaces. Its name should be preceded by an at sign @ on the command line. No space is allowed between the at sign @ and the filename.

10.3 COMMAND LINE OPTIONS

The command line options are not case sensitive and can be specified in any order provided that all applicable options are specified prior to the file to be translated.

10.3.1 FILE SPECIFICATION OPTIONS

Option          Description                 Default                     Changing Default

/d=             Define the directory        (none)                      Use wildcard characters to
                remapping for translated                                specify the new location of
                source code.                                            translated source code files.

/e<+|->         Whether #include files      /e+*                        /e+ includes
                are translated.             Translates all files.       /e- excludes
                                                                        Files are included or excluded
                                                                        according to the wildcard pattern.
10.3.2 FLAG SETTINGS

Option          Description                 Default                     Changing Default

/f<+|->error    Controls when C2CPP quits   /f+error                    /f-error
                after finding an error.     Quit after the first file   Continue processing
                                            with an error.              regardless of errors.

/f<+|->export   Whether classes are         /f+export                   /f-export
                exported.                   All classes are declared    Classes are declared
                                            with the Export keyword.    without the Export keyword.

/f<+|->cons     Output of constructors.     /f+cons                     /f-cons
                                            Declare a constructor for   Don't declare a
                                            each class.                 constructor.

/f<+|->des      Output of destructors.      /f+des                      /f-des
                                            Declare a destructor for    Don't declare a
                                            each class.                 destructor.

/f<+|->copy     Output of copy              /f+copy                     /f-copy
                constructors.               Declare a copy constructor  Don't declare a copy
                                            for each class.             constructor.

/f<+|->assn     Output of assignment        /f+assn                     /f-assn
                operators.                  Declare an assignment       Don't declare an
                                            operator for each class.    assignment operator.

/f<+|->prefix   Examine function name       /f+prefix                   /f-prefix
                prefix to help determine                                Ignores first word of
                class.                                                  function name when
                                                                        determining its class.

/f<+|->suffix   Examine function name       /f+suffix                   /f-suffix
                suffix to help determine                                Ignores last word of
                class.                                                  function name when
                                                                        determining its class.

/f<+|->middle   Examine function name       /f+middle                   /f-middle
                inner section to help                                   Ignores inner words of
                determine class.                                        function name when
                                                                        determining its class.

/f<+|->rename   Allow renaming of old       /f+rename                   /f-rename
                struct names for class      Generates new names for     Makes the class name the
                naming.                     classes based on initial    same as the struct name.
                                            capitalisation and lower
                                            case.

10.3.3 SOURCE CODE OPTIONS

Option          Description                 Default                     Changing Default

/g<+|->         C preprocessor command      (none)                      /g+ keeps .i file.
                line.                                                   /g- deletes .i file
                                                                        afterwards.
                                                                        If a .i file is not present,
                                                                        C2CPP executes this command
                                                                        to generate it, replacing %f
                                                                        with the name of the C file.

/h=             Special treatment of        All headers default to      Levels can be private,
                headers with a given file   private, until C2CPP can    protected or public.
                extension.                  assign a dynamically
                                            determined scope level.

/ri             Defines an identifier as a  (none)                      The word is ignored during
                C language extension                                    parsing of the C program.
                reserved word.

/t              Expansion of tab            /t3                         Tabs are expanded to every n
                characters.                 Tabs are every 3            characters. This option
                                            characters.                 affects error diagnostics
                                                                        only.

(C filename)    C filename to translate     (none)                      The specified filename is
                into C++ (no wildcards).                                treated as a C source code
                                                                        module and is translated
                                                                        into C++ according to the
                                                                        current translation phase.

10.3.4 OTHER OPTIONS

Option          Description                 Default                     Changing Default

/?              Output help on options.     (none)

/p              Switches to a new           /p0                         /p1 Analyses modules and
                translation phase.          Syntax check only.          classes.
                                                                        /p2 Analyses functions.
                                                                        /p3 Outputs modified C++
                                                                        source code.
                                                                        /p4 Outputs generated C++
                                                                        header files.

/s              Stores the database files   Database files are stored   Database files are stored
                in the indicated            in the current directory.   in the specified directory.
                directory.

/w              Suppress messages.          /w5                         Suppresses messages which
                                            No messages are             are less severe than level
                                            suppressed.                 n (range 1 to 5).

@               Filename containing more    (none)                      The filename is read and
                command line switches.                                  appended to the current
                                                                        command line. Nested
                                                                        inclusions are allowed.

10.4 PREPARING YOUR SOURCE CODE

Your source code must comply with ANSI-standard C syntax and be syntax error free before submitting it to C2C++. This does not limit you to translating code that is OS/2 specific. Any source code written for any operating system platform that complies with the minimum source code layout standards described in this manual can be submitted to C2C++.

We highly recommend that you recompile your original C source code with your compiler set to output the maximum level of warning messages prior to running C2C++. This will enable you to remove any errors or problems with your original C source code prior to translating it to C++.

You must be able to invoke a preprocessor on your code to produce a preprocessed file with a '.i' extension in the same directory as your original '.c' file. The '.i' files must contain #line directives to specify the original source code location. It is also preferable to instruct the preprocessor to preserve comments within the '.i' file.

Your code layout must obey some simple standard rules in order for C2C++ to be able to preserve features such as comments above data structures in the translated source code.

You must have enough free disk space to store the translated code and header files as well as the database files which describe your code. As a rule of thumb, if your source code occupies 1 megabyte, the translated C++ will also occupy 1 megabyte, the preprocessor files will occupy between 1 and 3 megabytes and the C2C++ database will occupy another 1 megabyte, giving a typical maximum of 5 times the original source code size.

The translated files can be placed in any specified directory tree on any drive, as can the C2C++ database files.

10.5 YOUR SOURCE CODE CONVENTIONS

Your code must obey certain coding conventions:

a) None of your modules may have a file extension of cpp or hpp, otherwise a name collision will result with the translated code.
b) If macros are to be modified by the conversion program, they should expand to no more than 4 lines of C source code, otherwise they may cause the output of warning messages regarding code not being found.
c) For function parameters and variable declarations, the type and the variable name must appear on the same source code line.
d) Any include file that needs conversion must be enclosed in quotes "" not angle brackets <> when #include'd into a source file.
e) All functions and structs within a set of modules to be converted must be given unique names.
f) All typedefs within a set of modules to be converted must be given unique names or must have the same textual definition.
g) The function definition statement block must start on a new line with an open curly brace { as its first character.
h) The name of the function must be on the first line of its declaration and must be followed by an open round bracket ( on the same line.
i) Non-aggregate typedefs should appear on a single line of code and should not be accompanied by any other code on that same line.
j) Function declarations and definitions must begin on a new line.
k) Struct data member declarations within a struct declaration must each begin on a new line.
l) For struct declarations, the first line must contain the struct keyword and the opening brace { and the last line must contain the closing brace } and any related typedefs.
m) No duplicate module names are allowed.

Most of these requirements are a matter of good style and most programs should need little or no modification to comply with them. For cases where many modifications would be necessary, it is suggested that a source code layout program be purchased which will automate the process.

10.6 DEFINING CLASSES OF FUNCTIONS

The data type of the first parameter of a function is used by default to determine the function's class, so re-arrangement of function parameters will improve class recognition.

For functions which cannot be classified according to their first parameter, the database is queried to determine the default class of the module in which the function was found.

If the function's class still cannot be determined, the name of the function itself is analysed to help determine its class. This name analysis is controlled by the /f+prefix, /f+middle and /f+suffix command line options.

The method is as follows: The function name is broken up into sections, separated where the first letter of a word is capitalised or by underbar _ characters. The name of each section is then compared against all candidate class names and if a match is found, the function is categorised into that class.
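
The splitting and matching steps just described can be sketched in a few lines of C++. This is only an illustration of the rule as stated in this manual, not the code used inside C2C++. The function names SplitName and FindClass are hypothetical, and two details are assumptions: a run of capital letters is kept together as one section, and sections are compared with class names by exact, case-sensitive match.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split an identifier into sections at underbar characters and wherever a
// capitalised word begins. A run of capitals is kept together (assumption).
std::vector<std::string> SplitName(const std::string& name)
{
    std::vector<std::string> sections;
    std::string current;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const unsigned char ch = static_cast<unsigned char>(name[i]);
        if (ch == '_') {                       // underbar ends the current section
            if (!current.empty()) {
                sections.push_back(current);
                current.clear();
            }
            continue;
        }
        bool startsWord = false;
        if (std::isupper(ch) && !current.empty()) {
            const unsigned char prev = static_cast<unsigned char>(name[i - 1]);
            const bool nextIsLower = i + 1 < name.size() &&
                std::islower(static_cast<unsigned char>(name[i + 1]));
            // A capital starts a new word after a non-capital, or when it is
            // the last capital of a run and is followed by a lower-case letter.
            startsWord = !std::isupper(prev) || nextIsLower;
        }
        if (startsWord) {
            sections.push_back(current);
            current.clear();
        }
        current += name[i];
    }
    if (!current.empty())
        sections.push_back(current);
    return sections;
}

// Return the first section that equals a candidate class name, or an empty
// string if none matches. Sections are tried in the order supplied, so the
// caller decides which sections have priority.
std::string FindClass(const std::vector<std::string>& sections,
                      const std::vector<std::string>& candidates)
{
    for (const std::string& section : sections)
        for (const std::string& candidate : candidates)
            if (section == candidate)
                return candidate;
    return std::string();
}
```

With this sketch, the hypothetical name "Window_SetTitle" splits into "Window", "Set" and "Title", and FindClass would report "Window" if a class of that name is among the candidates.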
The priority order is to analyse the prefix, then the suffix and finally any words in-between.

For example, if the function name is "DefineSQLOutput", the name sections would be "Define", "SQL", and "Output". As another example, "_Control_is_visible" would split up into "Control", "is" and "visible".

Finally, if all of the above fail, the name of the module and then the name of the directory in which the module resides are also matched against class names. If a matching class is still not found, the function is classified as global.

Should C2C++ incorrectly classify any module or function, users of the unlimited version can modify the decision made by C2C++ by editing the program description database. Refer to the Advanced Topics section below for more details.

10.7 TUTORIAL

10.7.1 DETERMINING WHAT TO TRANSLATE

C2C++ analyses your program in its entirety, so all source code that constitutes your program should be submitted to C2C++ for translation. If you miss out modules, the C++ classes will be incomplete and functions may be given incorrect scope.

The easiest way to list all the files within a project is to run "dir *.c /f /s >file.lst" which will list the full path name of all C source code files in the current directory and below. You do not need to include header files in the list of files to translate because they are automatically translated when they are included into the source code files.

10.7.2 PREPARATION

It is advisable to set up all your commonly used command line options and include them into a response file named (for example) "c2cpp.var". A good example of such a file is included in the "sample" directory. You should be able to make use of this file with minor modifications to the "/d", "/e" and possibly "/g" command line switches to translate your own source code.

Your source code should obey the ANSI-C syntax and comply with the layout requirements described earlier.
To check that your source code is capable of being understood by C2C++, you should run translation phase 0 which is simply a syntax check that does not modify your source code in any way. Following on with our example, you would invoke C2C++ as follows:

   "C2CPP @c2cpp.var /p0 @file.lst"

and then check the message output for any errors.

If you encounter syntax errors caused by language extensions implemented by your compiler, you can instruct C2C++ to ignore compiler-specific keywords by listing them in the c2cpp.var file using /ri command line switches. If your error is not caused by a compiler-specific keyword, please contact Technical Support for advice.

10.7.3 DECIDING ON C2CPP PARAMETERS

An example of what C2CPP command line switches (parameters) to use is given in the "sample\c2cpp.var" file. Typically every such parameter file will contain the following families of switches: "/d", "/e", "/g" and (for users of the unlimited version) "/s". All of the other switches can either be safely left at their default settings, or can be copied directly from the "sample\c2cpp.var" file.

10.7.3.1 THE "/D" SWITCH

The "/d" switch is very important because it defines the location of the translated C++ code. We strongly suggest that the translated code be redirected to a separate directory tree if there are multiple directories involved, or to a single sub-directory (such as in the sample) if only one directory is involved. The directories in which the translated files are to be placed should be created before C2C++ is run.

For both sides of the "=" character within the "/d" switch, wildcard characters "?" and "*" may be used. If wildcards are used, they must match in number and type on both sides of the "=" character.
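
The matching rule just stated can be checked mechanically before a mapping is placed in a response file. The following is a minimal sketch and not part of C2C++. The function names are hypothetical, and it assumes that "match in number and type" means the same wildcard characters appear in the same order on both sides of the "=" character.

```cpp
#include <string>

// Collect the wildcard characters of a pattern, in the order they appear.
std::string WildcardSignature(const std::string& pattern)
{
    std::string signature;
    for (char ch : pattern)
        if (ch == '?' || ch == '*')
            signature += ch;
    return signature;
}

// Check the text that follows "/d", which has the form source=destination.
// Returns true when both sides carry the same wildcards in the same order
// (an assumption about how the rule is applied).
bool IsValidDirectoryMapping(const std::string& mapping)
{
    const std::string::size_type equals = mapping.find('=');
    if (equals == std::string::npos)
        return false;                          // no "=" separator at all
    const std::string source = mapping.substr(0, equals);
    const std::string destination = mapping.substr(equals + 1);
    return WildcardSignature(source) == WildcardSignature(destination);
}
```

A mapping with one "*" on the left and none on the right is rejected by this check, while a mapping with "?" and "*" on both sides is accepted.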

For example:

   "/d?:\c2cpp\sample\*=?:\c2cpp\sample\cpp\*"
will map the translation of all C source files in the "\c2cpp\sample" directory and its sub-directories into the "\c2cpp\sample\cpp" directory and its sub-directories, on the same drive as the original C source code.

   "/Dc:\yourcode\*=d:\yourcode\*"
will map the translation of all C source files in the "c:\yourcode" directory into the "d:\yourcode" directory and its sub-directories.

   "/Dc:\c\*=c:\cpp\"
is an error because the number of "*" characters does not match.

   "/D*\old.*=*\new.*"
will rename all "old" files to be named "new" and keep them in the same directory.

You may specify more than one directory mapping switch to be able to map multiple sets of files to different locations.

10.7.3.2 THE "/E" SWITCH

The "/e" switch works on a similar basis for specifying which C header files should be translated. The "/e" is followed by either "+" or "-" to include or exclude, respectively, the files that match the file specification pattern which follows. Again you may specify multiple "/e" switches, each one of which cumulatively redefines the existing set of inclusions or exclusions.

For example:

   "/e-* /e+?:\c2cpp\sample\*"
will only translate those headers which reside in the "\c2cpp\sample" directory and its sub-directories, regardless of the drive.

   "/E-?:\ibmcpp\* /e-?:\toolkt21\*"
will translate all header files except for those contained in either of the "\ibmcpp" or "\toolkt21" directories and their sub-directories.

We recommend that you use a "/e" switch of the form:

   "/e-* /e+c:\yourcode\*"

as this is the most likely to precisely define the correct set of headers to translate.

10.7.3.3 THE "/F" SWITCHES

For small projects we recommend using:

   "/f-cons /f-des /f-copy /f-assn"

to suppress output of C++ constructors, destructors, copy constructors and assignment operators. This will allow easier initial compilation of the translated C++ code.

For larger projects we suggest using:

   "/f+cons /f+des /f+copy /f+assn"

because it is good C++ style to define these functions and operators for every class.

If your classes are to be compiled into a DLL, it may be useful to use "/f+export" which declares each class in its ".hpp" file to have the "EXPORT" keyword.

If your existing struct names are all in upper case you might consider using "/f+rename" to generate a new set of class names based on the original struct names, but using mixed case.

If you have used a function name convention in your original C source code which specifies that the function name should include the name of the struct which it manipulates, you should enable the following flags:

   "/f+prefix /f+middle /f+suffix"

because a higher percentage of functions will then be correctly categorised into their classes.

10.7.3.4 THE "/G" SWITCH

The "/g" switch varies from one C compiler to another. The one given in the "sample\c2cpp.var" file is the best one to use for IBM CSet++ because it will generate the ".i" file in the current directory, while preserving all comments and including "#line" directives. If you use another compiler, replace this line with your own. Note that you can make use of batch files if necessary.

We suggest that you use "/g+" if you have enough disk space to store all of the preprocessed files because this means the C preprocessor will only be invoked once instead of three times (for translation phases 1 to 3). If you have limited disk space, use "/g-" because this option implies that only one preprocessed file will reside on the disk at any point in time. Be aware that inclusion of files such as the OS/2 PM or Microsoft Windows header files can result in large preprocessed C source files.

Within the "/g" command line, you should use "%f" to specify the location of where the filename of the current C source code module should be substituted.
This filename has no drive or directory specification because the current drive and directory are set to be the same as the C source code file prior to invocation of the C compiler's preprocessor.

10.7.3.5 THE "/H" SWITCH

If your original C program has adopted some C++ concepts already, you can take advantage of this by segregating function definitions of private, protected and public functions into separate header files with different file extensions. The "/h" switch allows you to specify which extension is used for what level of scope.

For example:

   "/hh0=private"
specifies that all header files with the extension ".h0" will have their functions categorised as private to their determined class.

Irrespective of this setting, if C2C++ detects that a function which is currently declared private is actually being used as though it were public, it will change the scope of that function appropriately.

10.7.3.6 THE "/RI" SWITCH

This switch is used to specify compiler-specific reserved keywords. A sample set of such keywords is included in the "sample\c2cpp.var" file. The keywords are not case sensitive, but they may only contain characters which would constitute a valid C identifier. All of the IBM CSet++ version 2.1 keywords have been pre-declared within C2C++, so there is no need to extend the list for this compiler.

10.7.3.7 THE "/S" SWITCH

This switch is only applicable to the unlimited version. It is good practice to set this switch when you want to convert more than one program from C to C++, as it allows you to keep separate databases for each such program. The name following the "/s" switch is treated as a directory name and should not contain any wildcards. The directory should exist prior to invoking C2C++.

10.7.4 INVOCATION OF C2CPP

It is a requirement of C2C++ that all modules be translated through all four phases of translation. Each phase must be complete before the next begins.
This means for example that you cannot run a module through phase 2 if you have run phase 3 on any modules already.

Users of the shareware version are forced to invoke all four phases on the same command line by using a command line of the form:

    "C2CPP @c2cpp.var /p1 @file.lst /p2 @file.lst /p3 @file.lst /p4"

but users of the unlimited version can invoke each phase separately. The advantage of executing each phase separately is that you can edit the program description database after each phase in order to fine-tune the translation. The command lines for each separate phase would be:

    "C2CPP @c2cpp.var /p1 @file.lst"
    "C2CPP @c2cpp.var /p2 @file.lst"
    "C2CPP @c2cpp.var /p3 @file.lst"
    "C2CPP @c2cpp.var /p4"

Note that phase 4 does not require a file list.

It is advisable to completely re-run C2C++, starting with phase 1, after each change to the original C source code. This is essential if header files are modified. In addition, please remember to regenerate the preprocessor ".i" files after each source code change.

10.7.5 COMPILING THE TRANSLATED C++ CODE

After all four phases of C2C++ translation are complete, you are ready to run your C++ compiler on the translated code. There are some situations which will cause syntax errors within the translated code. They fall into the following broad categories:

a) Warnings issued by C2C++ during its translation of the original C source code: Do not ignore these warnings because they will typically point to problems that can easily be fixed prior to submitting the C++ code for compilation.

b) Header file interdependencies: It is likely that your header files will have cross-dependencies, most of which C2C++ will have resolved through nested header file inclusion or appropriate forward declarations, but it is possible that some dependencies will remain. These dependencies are typically resolved through re-ordering the contents of header files or re-ordering the inclusion of other header files.
c) Invalid expressions: C2C++ will accurately translate most expressions regardless of their complexity, but may fail if the expression is the result of a macro expansion. The remedy is either to simplify the original C source code and then re-run C2C++, or to manually edit the expression.

d) Symbol not defined errors when linking: These typically occur because not all of the original C source code was submitted for translation. This typically results in functions being declared within C++ classes but never defined. The solution is either to submit the extra source code for translation by C2C++ or to exclude the affected headers from translation by C2C++.

10.8 TIPS

a) Use the command line option /? to get help on command switches and find out the current version number.

b) Put your standard C2CPP command line switches into a command file such as c2cpp.var and then always start each command line with: "C2CPP @c2cpp.var". This is useful to define sets of reserved words, directory mappings, etc.

c) To process a lot of files which are in multiple sub-directories use "dir *.c /f /s >file.lst" and then make use of a command line of the kind: "C2CPP @c2cpp.var /p1 @file.lst /p2 @file.lst /p3 @file.lst /p4"

d) To check that all files can be understood by C2C++, use: "C2CPP @c2cpp.var /p0 @file.lst"

e) Make sure that you delete all ".csv" files from the database directory and all files from the translation output directories before starting a new conversion project because the database grows cumulatively and would therefore incorporate any existing code information.

f) Make sure you ALWAYS specify a "/d" command line switch to map translated C++ files to a different directory, otherwise the output C++ code will potentially overwrite your original C code.

g) Make sure you list all system include directories using the /e- switch. Alternatively, use /e-* to exclude all files and then use /e+ to selectively include files to be processed.
This second approach is the one taken in the sample code. If you fail to correctly specify the system include or source code include directories, C2C++ will attempt to translate these other header files too and will subsequently fail because it will not find the corresponding C source code.

h) Unless you create all your preprocessed ".i" files up-front before invoking C2C++, you will need to specify a /g command line switch (using "%f" to substitute the current filename). Use /g+ if you have enough disk space to store the preprocessed version of every module to be translated, otherwise use /g- which will run slower but use less disk space, because it deletes the corresponding ".i" file after each phase of translation.

i) In order to fully translate a program, you must run all modules through /p1 before running all modules again through /p2 and also through /p3. You don't need to specify any modules to run the final phase /p4. If you have the unlimited version, you can invoke C2C++ multiple times to separately process each translation phase or to separately invoke it on sets of modules within each translation phase. This modularity is achieved through the program description database which is stored on disk with the unlimited version, but kept in memory with the shareware version. C2C++ remembers the translation progress of every module it processes, so it is harmless to include the same module name twice on the command line.

j) The default scope for all source code and header files is private, unless you use the /h command line switch to selectively change this for header files with given extensions, or unless you edit the program description database. C2C++ dynamically analyses the usage of every function in the program and automatically gives public scope to all functions that are used publicly.
k) If you experience a large number of error or warning messages from C2C++, change the default /w5 command line switch to /w2, which is normally sufficient and reduces the number of messages dramatically. If you do not run with /w5, we recommend that you re-run C2C++ with /w5 after your source code has been fully translated, in order to pick up all the potential cases that C2C++ was not able to translate completely.

11. ADVANCED TOPICS

This section is only applicable to users of the unlimited version.

11.1 THE FOUR PHASES OF TRANSLATION

C2C++ progresses through four phases of translation of your original C source code, each of which progressively refines the program description database until a stage is reached where the translated C++ code can be output.

Phase 1 has the responsibility of syntax checking all modules and compiling a list of all modules (including header files) to be translated. It also produces a list of classes to be created.

Phase 2 analyses all of the functions in the program and decides in which class to categorise each function.

Phase 3 outputs the translated C++ code, modifying function definitions, function calls and expressions as appropriate.

Phase 4 outputs the generated C++ header files which detail the data and function members of each class.

It is after phases 1 and 2 that the user has the opportunity to edit the program description database to refine the decisions already made automatically by C2C++.

11.2 THE PROGRAM DESCRIPTION DATABASE

The program description database is stored in the directory specified by the "/s" command line switch. It is stored as a set of ".csv" files containing records with comma separated fields. This format was chosen to allow the broadest range of editors to be used, ranging from spreadsheets through to database programs and text editors.

This section describes each database file (called a "table" in relational database terminology) and each of the user-editable fields within each file.
Some fields are reserved for future enhancements and others are used internally and are hence not meant for external editing.

    CSV         Comma delimited. You can see from the sample code how this has been implemented.
    Internal    These values are used internally and must not be changed.
    Reserved    Reserved for future enhancements; do not change these values.

11.2.1 DATABASE TABLES

    File            Purpose
    Module.CSV      Table of original C source modules and headers.
    Class.CSV       Table of C++ classes.
    ClsData.CSV     Table of data members of C++ classes.
    Function.CSV    Table of C++ functions.
    FuncDecl.CSV    Table of C++ function declarations.
    Typedef.CSV     Table of C typedef statements.

11.2.2 DATABASE FIELDS

Fields are listed in the order given for each file.

    File            Field name      Field description

    Module.CSV      ModuleName      file name without extension
                    Extension       extension without '.'
                    FileName        full path name
                    ClassName       default class for functions
                    Phase           current phase of translation

    Class.CSV       ID              unique id number of the class
                    TypedefName     original typedef name of the struct
                    StructName      original struct tag name
                    (internal)
                    NewName         new name for the C++ class
                    (internal)
                    ModuleName      header in which the class will be defined
                    (internal)
                    (internal)
                    (internal)
                    (internal)
                    (internal)
                    (internal)
                    (reserved)
                    (reserved)
                    (reserved)
                    (reserved)
                    (reserved)
                    (reserved)
                    ParentClass1    name of first C++ parent class
                    ParentClass2    name of second C++ parent class
                    UsedClasses     names of classes referenced, separated by semicolons
                    (internal)
                    (internal)

    ClsData.CSV     (all internal)

    Function.CSV    ID              unique id number of the function
                    FunctionName    new name of the C++ function
                    ClassName       class in which function resides
                    ModuleName      module in which function is defined
                    Extension       file extension of that module
                    (internal)
                    (internal)
                    (internal)
                    (internal)
                    OldName         the original C function name
                    (internal)
                    (internal)
                    UsedClasses     names of classes referenced, separated by semicolons
                    (internal)
                    (internal)
                    Visibility      scope, where 2=public, 1=protected, 0=private
                    (reserved)
                    (reserved)
                    (reserved)

    FuncDecl.CSV    ID              unique id number of the function
                    Line            line number within the original C source code
                    Text            modified C++ function declaration

    Typedef.CSV     (all internal)

12. APPENDIX A - ERRORS

12.1 ERROR MESSAGE FORMAT

For every translation job or job step, C2C++ generates a return code that indicates to the operating system the degree of success or failure it achieved. The meanings of the return codes are:

    Code    Meaning
    0       No error detected. Translation completed.
            OR
            Possible error (warning) detected. Translation completed. Successful execution probable.
    1       Error detected reading the database files.
    2       Error, severe error or fatal error detected during translation of the C source code.

The C2C++ message format is:

    filename.ext(line:col): s ECSnnnn: text

where:

    filename.ext    file name (with extension) where the error occurred
    line            line where the error occurred
    col             column where the error occurred
    s               error severity text
    nnnn            error message number
    text            message text explaining the cause of the error

All messages are output to the standard error output device, and hence can be redirected to a file by appending "2>file" to the command line.

12.2 LIST OF ERROR MESSAGES

12.2.1 FATAL ERRORS

0001 Stack overflow
     The C source module is too complex. Try breaking it up into smaller modules.

0002 Out of heap space
     There is insufficient space for your OS/2 SWAPPER.DAT file to expand.

0003 Internal error: failed to read text
     This is typically caused by very long source code, at least 32000 characters long.

0004 Total failure to parse the C code
     The module being translated is probably not a C language program.

0005 Phase numbers must be in ascending sequence.
     You can only move on to the next translation phase once all previous phases have been completed. To backtrack, you will need to start at phase 1 again.

0006 Syntax is /dold=new
     The text of the /d command line switch is invalid.
0007 invalid /d option
     Either the number of wildcards is not consistent or there are more than 5 such wildcards.

0008 Internal error - please report to Technical Support.

0009 Limitations of shareware version exceeded.
     You have either specified an option available only to the users of the unlimited version, or you have a C program that exceeds the size limits placed upon the shareware version.

0010 Unable to open file %s
     The C code module probably does not exist.

0011 Use /g to define a preprocessor command
     A C source code module has been specified without a corresponding preprocessed ".i" file, and there is no preprocessor command specification to be able to generate that file.

0012 The preprocessor command failed with error code %i
     -or-
     The preprocessor command did not produce file %s
     The C compiler preprocessor failed to process your original C source code and produce the corresponding ".i" file. Try generating the ".i" file outside of C2C++.

12.2.2 ERRORS

1001 Unexpected end of file
     End of file was reached in the middle of a C language construct. Please check that your C source code is syntax error free.

1002 Illegal C language token
     C2C++ could not recognise that specific C language extension. Try rephrasing your code in ANSI standard C.

1003 Syntax error
     Please rewrite your code to use ANSI standard C.

1004 Two digit hexadecimal number expected
     A \x within a string should be followed by up to two hexadecimal digits.

1005 Three digit octal number expected
     A \ followed by a digit within a string should be followed by up to two further octal digits.

1006 \0 not allowed in the middle of a string
     A backslash followed by a 0 is not allowed in the middle of a string. It should only be used to define the character 0.

1007 Internal error
     Please report to Technical Support.

1008 Duplicate typedef for %s
     Two or more typedefs have been defined with the same name. C2C++ requires all typedefs to be distinct within a program.
1009 (reserved)

1010 Duplicate definition of class %s
     C2C++ requires that all structs be declared exactly once within a program.

1011 (not used)

1012 Inline code not supported for function %s.
     Some C compilers have extended ANSI standard C to allow inline code, but C2C++ does not allow this.

1013 Open brace for function %s must be the first character on the line.
     The formatting layout of the original C program is probably not as required.

1014 Failed to find open brace for function %s.
     The formatting layout of the original C program is probably not as required.

1015 Could not find function %s on the database.
     This was probably caused by user editing of the program description database.

1016 Could not find function call of %s.
     Calls to original C functions are replaced by calls to the equivalent C++ functions, but in this case, the original function call could not be located.

1017 NULL pointer for non-static member function %s
     This message indicates that the specified function should probably be declared as static, taking the class data pointer as its first argument, because C++ does not allow a NULL this pointer.

1018 Failed to change %s into 'this'.
     The variable specified could not be located within the original C source code. Try simplifying your code.

1019 Preprocessor directives not accepted within definition of function %s.
     C2C++ requires that function definitions and declarations not contain preprocessor directives between the opening round bracket and the opening curly brace.

1020 (reserved)

1021 Cannot read code file %s
     The specified C source file probably no longer exists.

1022 Could not find %s.
     The specified variable could not be found to be modified. Try simplifying the original C source code.

1023 Error: could not find module %s
     The module could not be modified because it no longer exists.

1024 Failed to read a code file line.
     An error occurred while reading the C source code file.
1025 Cannot write open the code file %s
     The file is probably locked by another process, or you do not have network permissions in that directory, or that directory does not exist.

1026 Failed to write file %s line %i.
     The disk is probably full.

1027 Failed to close the code file %s.
     This indicates an OS/2 file system error.

1028 Can't open source program file %s
     The file either does not exist or you don't have network permissions.

1029 Can't determine file size of %s
     This probably indicates a corrupted file system.

1030 Could not open database in file %s
     The specified database could not be opened because it probably does not exist anymore.

1031 Could not read from database in file %s
     The database file is corrupt, or has been edited so that it no longer conforms to comma separated variable requirements, or a field is missing.

1032 Could not create database in file %s.
     The file is probably locked by another process, or you do not have network permissions in that directory.

1033 Could not write to database in file %s
     The disk is probably full.

1034 Can't allocate memory to read file %s of size %i.
     OS/2 could not allocate sufficient virtual memory to read the file in its entirety. The remedy is to split your source code into smaller modules.

1035 Can't read entire file %s of size %i.
     An error occurred while reading the C source code file.

1036 Unknown phase %i.
     Only phase numbers from 0 to 4 inclusive are allowed.

1037 Unrecognised switch: %s
     Consult this manual for a table of allowed command line switches.

1038 Failed to find first argument of function %s
     The formatting layout of the original C program is probably not as required.

1039 Failed to remove argument(s) of function %s
     The formatting layout of the original C program is probably not as required.

1040 Failed to find call of function %s
     The formatting layout of the original C program is probably not as required.
1041 Function suffix not unique for %s
     The name of the function is too complex for C2C++ to break apart into its sections.

1042 Failed to rename call of function %s to %s
     The formatting layout of the original C program is probably not as required.

1043 Failed to remove call of function %s
     Calls to original C functions are replaced by calls to the equivalent C++ functions, but in this case, the original function call could not be located.

1044 Failed to find module class %s
     This is an internal error. Please report to Technical Support.

1045 Reserved function name %s being used.
     The new name of the C++ function is in conflict with a reserved function name. Please either rename the function in the original source code, or edit the program description database to remove the conflict.

1046 (not used)

1047 Could not locate line %i within file %s.
     This indicates that the line numbers specified by the C preprocessor with its #line directives do not correspond with the actual source code file. This typically occurs when the user modifies the original C source code but forgets to regenerate the preprocessor files afterwards.

1048 Could not replace line %i within file %s.
     This indicates that the line numbers specified by the C preprocessor with its #line directives do not correspond with the actual source code file. This typically occurs when the user modifies the original C source code but forgets to regenerate the preprocessor files afterwards.

1049 Could not delete line %i within file %s.
     This indicates that the line numbers specified by the C preprocessor with its #line directives do not correspond with the actual source code file. This typically occurs when the user modifies the original C source code but forgets to regenerate the preprocessor files afterwards.

1050 Could not create .HPP file for %s
     You either do not have network permissions in that directory, or that directory does not exist.
1051 Failed to read a header file line from %s
     An error occurred while reading the C header file.

1052 (reserved)

12.2.3 LEVEL 1 WARNINGS

2001 Multiple typedefs on one line.
     C2C++ has a restriction of analysing only one typedef per line. Please split the typedefs onto separate lines.

2002 Symbol %s is multiply defined.
     C2C++ requires that all type and struct names be distinct throughout a program.

2003 Assuming typedef is part of a struct definition.
     Notifies the user that the typedef will be associated with the struct's class.

2004 Could not find %s - probably caused by macro argument confusion.
     This warning is typically the result of a C preprocessor macro that outputs the same sub-expression twice or more. It is often benign because the macro argument would have been modified correctly the first time.

2005 (reserved)

2006 Failed to change %s into 'this' - probably caused by macro argument.
     This warning is typically the result of a C preprocessor macro that outputs the same sub-expression twice or more. It is often benign because the macro argument would have been modified correctly the first time.

2007 Could not determine a default class for module %s
     This notification message informs the user that C2C++ will not assign a default class to functions within that module unless the user edits the program description database appropriately.

2008 (reserved)

2009 (reserved)

2010 (reserved)

2011 Syntax is /ri or /rf
     Refer to this manual for the command line switch syntax.

2012 Syntax is /hext=level
     Refer to this manual for the command line switch syntax.

2013 Syntax is /e+pattern or /e-pattern
     Refer to this manual for the command line switch syntax.

2014 Syntax is /g+cmd or /g-cmd
     Refer to this manual for the command line switch syntax.

2015 Syntax is /f+ or /f-
     Refer to this manual for the command line switch syntax.

2016 Could not replace data type within variable declaration.
     The declaration of the data member is too complex for C2C++ to modify reliably.
2017 Could not replace data type of function %s, parameter %i
     The declaration of the function parameter is too complex for C2C++ to modify reliably.

2018 Name conflict between function and variable named %s, using fn_%s instead for the function.
     C++ requires that all class members have unique names. C2C++ handles conflicts between data and function members by prefixing the function name with "fn_".

2019 Deleting line %i, 1 beyond end of file %s.
     This typically occurs when the last line of the file is not terminated by a line feed character. Correct translation of the code is not affected.

2020 (reserved)

2021 (reserved)

12.2.4 LEVEL 2 WARNINGS

3001 (reserved)

3002 Ignoring severity level number out of range.
     Refer to this manual for the command line switch syntax.

3003 Using class %s from name %s
     This is a notification that the function's class had to be determined from its name. You might like to analyse all such decisions and edit the database to change C2C++'s decision.

12.2.5 INFORMATIONAL MESSAGES

4001 Processing file.
     Informs the user of each file as it is being analysed.

4002 Storing class: %s
     This is a notification that C++ header information is being output for the indicated class.

4003 Function: %s::%s
     This is a notification that C++ header information is being output for the indicated function.

4004 Function definition of %s
     This notification is output whenever C2C++ encounters another function definition within its analysis.

4005 File %s successfully modified.
     This notification is output during phase 3 when revised C code modules are output.

4006 Module: %s
     This message is output during phase 4 when the headers are generated for each module.

Please report all "Internal Errors" to Technical Support.

13. APPENDIX B - COST ANALYSIS

If you are serious about investigating the possibilities of conversion, you will need to know the cost benefits of using C2C++, so we have included formulas below which will guide you regarding the costs of a manual translation versus an automated translation by C2C++. We have used letter symbols to indicate the variables in the formulas and have included a minimal cost scenario set of sample figures in brackets.

13.1 COST OF MANUAL TRANSLATION

    Programmer cost per month in US$               a    ($5,000)
    Number of lines of code translated per day     b    (400)
    Number of productive days per month            c    (21)
    Number of lines of code in your program        d

    Cost to manually convert the program is        a * d / (b * c)

13.2 COST OF AUTOMATIC TRANSLATION

    Purchase price of C2C++ in US$                 m    ($1,495)
    Number of days to learn how to use C2C++       n    (1)

    Cost to automatically convert the program is   m + a * n / c

Based on the above pricing and productivity figures, the results for different size C source code programs are:

    Program size      Manual cost    Automatic cost    Ratio
      1,000 lines     $    595       $ 1,733            0.3x
     10,000 lines     $  5,952       $ 1,733            3.4x
    100,000 lines     $ 59,524       $ 1,733           34.3x

As a general guideline we would suggest the purchase of C2C++ if the cost of manual conversion is at least three times that of the purchase price and training costs of C2C++, which in our above scenario would establish a cutoff point at about 8,700 lines of code, beyond which it makes compelling financial sense to purchase and utilise C2C++. The general formula for the breakeven cutoff point in lines of code, using a three times cost multiplier, is given by:

    3 * b * ( n + c * m / a )

14. BIBLIOGRAPHY

    "C++ Inside and Out" by Bruce Eckel
    "Effective C++" by Scott Meyers
    "The Annotated C++ Reference Manual" (also known as the ARM) by Bjarne Stroustrup and Margaret Ellis
    "Advanced C++" by James Coplien

15. YOUR SUGGESTIONS

CompuSource welcomes suggestions to improve C2C++. Only those improvements and enhancements which are both viable and improve the quality of C2C++ will be considered for implementation, as these are two prerequisites to producing a commercial product. Suggestions that meet these requirements will be implemented promptly.
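
WORKED EXAMPLE FOR APPENDIX B

The cost formulas in Appendix B can be checked with a short script. The following is a minimal sketch in Python and is not part of C2C++; the function names manual_cost, automatic_cost and breakeven_lines are hypothetical, and the default values are simply the sample figures quoted in sections 13.1 and 13.2. Substitute your own figures for a, b, c, m and n.

```python
# Sketch of the Appendix B cost formulas (illustrative names, not part of C2C++).
# Defaults are the manual's sample figures:
#   a = programmer cost per month (US$), b = lines translated per day,
#   c = productive days per month, m = purchase price (US$), n = training days.

def manual_cost(d, a=5000.0, b=400.0, c=21.0):
    """Cost to manually convert a program of d lines: a * d / (b * c)."""
    return a * d / (b * c)

def automatic_cost(a=5000.0, c=21.0, m=1495.0, n=1.0):
    """Cost to convert with the tool: m + a * n / c."""
    return m + a * n / c

def breakeven_lines(a=5000.0, b=400.0, c=21.0, m=1495.0, n=1.0, multiplier=3.0):
    """Program size at which manual cost equals multiplier * automatic cost."""
    return multiplier * b * (n + c * m / a)

if __name__ == "__main__":
    auto = automatic_cost()
    for lines in (1000, 10000, 100000):
        manual = manual_cost(lines)
        print(f"{lines:>7,} lines  manual ${manual:>7,.0f}  "
              f"automatic ${auto:,.0f}  ratio {manual / auto:.1f}x")
    print(f"cutoff: about {breakeven_lines():,.0f} lines")
```

With the sample figures this reproduces the manual, automatic and ratio columns of the table in section 13.2, and the cutoff works out to roughly 8,700 lines. The cutoff formula follows from setting a * d / (b * c) equal to three times m + a * n / c and solving for d.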